71,998 research outputs found

Efficient resources assignment schemes for clustered multithreaded processors

Author: Fernando Latorre
González Colás Antonio María
González González José
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2008

New feature sizes provide a larger number of transistors per chip that architects could use to further exploit instruction-level parallelism. However, these technologies also bring new challenges that complicate conventional monolithic processor designs. On the one hand, exploiting instruction-level parallelism is yielding diminishing returns, so exploiting other sources of parallelism, such as thread-level parallelism, is needed to keep raising performance with reasonable hardware complexity. On the other hand, clustered architectures have been widely studied as a way to reduce the inherent complexity of current monolithic processors. This paper studies the synergies and trade-offs between two concepts, clustering and simultaneous multithreading (SMT), in order to understand why conventional SMT resource assignment schemes are not so effective in clustered processors. These trade-offs are used to propose a novel resource assignment scheme that achieves an average speedup of 17.6% versus Icount while improving fairness by 24%. Peer reviewed. Postprint (published version).

UPCommons. Portal del coneixement obert de la UPC

Frontend frequency-voltage adaptation for optimal energy-delay²

Author: González Colás Antonio María
González González José
Grigorios Magklis
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2004

In this paper, we present a clustered, multiple-clock domain (CMCD) microarchitecture that combines the benefits of both clustering and globally asynchronous locally synchronous (GALS) designs. We also present a mechanism for dynamically adapting the frequency and voltage of the frontend of the CMCD with the goal of optimizing the energy-delay² product (ED2P). Our mechanism has minimal hardware cost, is entirely self-adjustable, does not depend on any thresholds, and achieves results close to optimal. We evaluate it on 16 SPEC 2000 applications and report a 17.5% ED2P reduction on average (80% of the upper bound). Peer reviewed. Postprint (published version).

UPCommons. Portal del coneixement obert de la UPC

The student evaluation of teaching and the competence of students as evaluators

Author: Dorta-González María Isabel
Dorta-González Pablo
Publication date: 01/01/2013

When the college student satisfaction survey is considered in the promotion and recognition of instructors, a usual complaint relates to the impact that biased ratings have on the arithmetic mean (used as a measure of teaching effectiveness). This is especially significant when the number of students responding to the survey is small. In this work, a new methodology that considers student-to-student perceptions is presented. Two different estimators of student rating credibility, based on centrality properties of the student social network, are proposed. This method is founded on the idea that, in on-site higher education, students often know which of their peers are competent to rate the teaching and learning process. Comment: 20 pages, 2 tables.

arXiv.org e-Print Archive

DIALNET

Control speculation for energy-efficient next-generation superscalar processors

Author: Aragón Juan Luis
González Colás Antonio María
González González José
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2006

Conventional front-end designs attempt to maximize the number of "in-flight" instructions in the pipeline. However, branch mispredictions cause the processor to fetch useless instructions that are eventually squashed, increasing front-end energy and issue queue utilization and, thus, wasting around 30 percent of the power dissipated by a processor. Furthermore, processor design trends lead to increasing clock frequencies by lengthening the pipeline, which puts more pressure on the branch prediction engine since branches take longer to be resolved. As next-generation high-performance processors become deeply pipelined, the amount of wasted energy due to misspeculated instructions will go up. The aim of this work is to reduce the energy consumption of misspeculated instructions. We propose selective throttling, which triggers different power-aware techniques (fetch throttling, decode throttling, or disabling the selection logic) depending on the branch prediction confidence level. Results show that combining fetch-bandwidth reduction along with select-logic disabling provides the best performance in terms of overall energy reduction and energy-delay product improvement (14 percent and 10 percent, respectively, for a processor with a 22-stage pipeline and 16 percent and 13 percent, respectively, for a processor with a 42-stage pipeline). Peer reviewed. Postprint (published version).

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

UPCommons. Portal del coneixement obert de la UPC

Benson Snippets: Digitized Copies of Books from Latin American Collection Appear Online

Author: González Marinas María Elena
Publication venue: Teresa Lozano Long Institute of Latin American Studies
Publication date: 01/01/2008

Latin American Studies

Texas ScholarWorks

Virtual-physical registers

Author: González Colás Antonio María
González González José
Valero Cortés Mateo
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/1998

A novel dynamic register renaming approach is proposed in this work. The key idea of the scheme is to delay the allocation of physical registers until a late stage in the pipeline, instead of doing it in the decode stage as conventional schemes do. In this way, register pressure is reduced and the processor can exploit more instruction-level parallelism. Delaying the allocation of physical registers requires some additional artifact to keep track of dependences. This is achieved by introducing the concept of virtual-physical registers, which do not require any storage location and are used to identify dependences among instructions that have not yet allocated a register to their destination operand. Two alternative allocation strategies have been investigated that differ in the stage where physical registers are allocated: issue or write-back. The experimental evaluation has confirmed the higher performance of the latter alternative. We have evaluated the novel scheme through a detailed simulation of a dynamically scheduled processor. The results show a significant improvement (e.g., a 19% increase in IPC for a machine with 64 physical registers in each file) when compared with the traditional register renaming approach. Peer reviewed. Postprint (published version).

UPCommons. Portal del coneixement obert de la UPC

Using MCD-DVS for dynamic thermal management performance improvement

Author: Chaparro Pedro
González Colás Antonio María
González González José
Magklis Grigorios
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2006

With chip temperature being a major hurdle in microprocessor design, techniques to recover the performance lost to thermal emergency mechanisms are crucial in order to sustain performance growth. Many techniques for power reduction in the past, and some on thermal management more recently, have helped alleviate this problem. Probably the most important thermal control technique is dynamic voltage and frequency scaling (DVS), which allows for an almost cubic reduction in power with only a linear worst-case performance penalty. So far, DVS techniques for temperature control have been studied at the chip level. Finer-grain DVS is feasible if a globally-asynchronous locally-synchronous (GALS) design style is employed. GALS, also known as multiple-clock domain (MCD), allows for independent voltage and frequency control for each of the clock domains that are part of the chip. There are several studies on DVS for GALS that aim to improve energy and power efficiency but not temperature. This paper proposes and analyses the usage of DVS at the domain level to control temperature in a clustered MCD microarchitecture with the goal of improving the performance of applications that do not meet the thermal constraints imposed by the designers. Peer reviewed. Postprint (published version).

UPCommons. Portal del coneixement obert de la UPC

Negotiation of meaning in outside of the classroom group assignments: accounting for the how to understand the what of future mathematics teachers' learning

Author: González María José
Gómez Pedro
Mesa Vilma María
Publication venue: Pitagora
Publication date: 01/01/2011

In this paper we illustrate how Wenger’s theory of social learning can be used to account for phenomena of future teachers’ change in settings that are not usually studied, namely the group work that future teachers do as they work on class assignments outside of class. We describe how we adapted Wenger’s theory to the exploration of future mathematics teachers’ learning and illustrate how the analysis of the audiotaped interaction of a group of future teachers working outside the classroom generated conjectures that help to explain their didactic knowledge development.

Funes

X-ray/gamma-ray flux correlations in the BL Lacs Mrk 421 and 501 using HAWC data

Author: Fraija Nissim
García-González J. A.
González María Magdalena
Publication date: 30/08/2017

The HAWC gamma-ray observatory is located at the Sierra Negra Volcano in Puebla, Mexico, at an altitude of 4,100 meters. HAWC is a wide field-of-view array of 300 water Cherenkov detectors that continuously survey ~2 sr of the sky and has been operating since March 2015. The large collected data sample allows HAWC to perform unbiased monitoring of the BL Lac Mrk 421. This is the closest and brightest known extragalactic high-synchrotron-peaked BL Lac in the gamma-ray/X-ray bands and is extensively monitored by the Large Area Telescope (LAT) on board the Fermi satellite, and by the BAT and XRT instruments of the Swift satellite. In this work, we use 25 months of HAWC data together with Swift-XRT data to characterize potential correlations between both wavelengths. This analysis shows that HAWC and Swift-XRT data are correlated even more strongly than expected for quasi-simultaneous observations. Comment: Presented at the 35th International Cosmic Ray Conference (ICRC2017), Bexco, Busan, Korea. See arXiv:1708.02572 for all HAWC contributions.

arXiv.org e-Print Archive

Crossref